Reduction of Intermediate Alphabets in Finite-State Transducer Cascades
Abstract
This article describes an algorithm for reducing the intermediate alphabets in cascades of finite-state transducers (FSTs). Although the method modifies the component FSTs, the overall relation described by the whole cascade is unchanged. No additional information or special algorithm that could slow down the processing of input is required at runtime. Two examples from Natural Language Processing illustrate the effect of the algorithm on the sizes of the FSTs and their alphabets. For some FSTs, the number of arcs and symbols shrank considerably.
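The idea of shrinking an intermediate alphabet without changing the cascade's relation can be illustrated with a toy sketch. This is not the article's algorithm. It assumes a simplified criterion: two intermediate symbols are merged when the second transducer reads them in exactly the same contexts (same source state, output symbol, and target state). The arc representation and the names `compose` and `reduce_intermediate` are hypothetical, and epsilon arcs and final states are ignored.

```python
# An arc is a tuple (source_state, input_symbol, output_symbol, target_state).
# Assumption: epsilon-free transducers; initial/final states are left untouched.

def compose(t1, t2):
    """Compose two arc sets; states of the result are pairs of states."""
    return {((s1, s2), x, y, (d1, d2))
            for (s1, x, m1, d1) in t1
            for (s2, m2, y, d2) in t2
            if m1 == m2}

def reduce_intermediate(t1, t2):
    """Merge intermediate symbols that t2 treats identically (toy criterion)."""
    # Record, for every intermediate symbol, the contexts in which t2 reads it.
    contexts = {}
    for (_, _, m, _) in t1:
        contexts.setdefault(m, set())
    for (s, m, y, d) in t2:
        contexts.setdefault(m, set()).add((s, y, d))
    # Symbols with the same context set share one representative.
    representative, by_signature = {}, {}
    for m in sorted(contexts):
        signature = frozenset(contexts[m])
        representative[m] = by_signature.setdefault(signature, m)
    new_t1 = {(s, x, representative[m], d) for (s, x, m, d) in t1}
    new_t2 = {(s, representative[m], y, d) for (s, m, y, d) in t2}
    return new_t1, new_t2

# Toy cascade: t1 emits X, Y, Z; t2 maps X and Y to "1" and Z to "2".
t1 = {(0, "a", "X", 0), (0, "b", "Y", 0), (0, "c", "Z", 0)}
t2 = {(0, "X", "1", 0), (0, "Y", "1", 0), (0, "Z", "2", 0)}
new_t1, new_t2 = reduce_intermediate(t1, t2)

# The composed arcs are identical, although Y no longer occurs between the FSTs.
print(compose(t1, t2) == compose(new_t1, new_t2))
print(sorted({m for (_, _, m, _) in new_t1}))
```

In this toy case the intermediate alphabet shrinks from three symbols to two and the second transducer loses one arc, while the composed arc set stays the same. The change is applied to the component transducers themselves, so nothing extra is needed when the cascade is run.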
Similar resources
Minimum Inferred Finite State
More work should be done on learning homomorphisms into larger alphabets. More importantly, it would be interesting to find natural linguistic limitations of the type of morphological transformations such that the resultant learning problem would become tractable. References [1] Angluin, D. and C. Smith, "Inductive inference: theory and methods". Comput. QUESTION: Is there a K-state determinist...
Variable Automata over Infinite Alphabets
Automated reasoning about systems with infinite domains requires an extension of automata, and in particular, regular automata, to infinite alphabets. Existing formalisms of such automata cope with the infiniteness of the alphabet by adding to the automaton a set of registers or pebbles, or by attributing the alphabet by labels from an auxiliary finite alphabet that is read by an intermediate t...
Efficient Online k-Best Lookup in Weighted Finite-State Cascades
Weighted finite-state transducers (WFSTs) have proved to be powerful and efficient aids for a variety of natural-language processing tasks, including automatic phonetization and phonological rule systems (Kaplan & Kay, 1994; Laporte, 1997), morphological analysis (Geyken & Hanneforth, 2006), and shallow syntactic parsing (Roche, 1997). In particular, cascades arising from the composition of two...
Minimization of Symbolic Transducers
Symbolic transducers extend classical finite state transducers to infinite or large alphabets like Unicode, and are a popular tool in areas requiring reasoning over string transformations where traditional techniques do not scale. Here we develop the theory for and an algorithm for computing quotients of such transducers under indistinguishability preserving equivalence relations over states su...
KNG: a Tool for Writing Easily Transducer Cascades (KNG: un outil pour l'écriture facile de cascades de transducteurs) [in French]
Résumé. This article presents a Python library called KNG that makes it easy to write finite automata and transducers. Thanks to careful handling of encodings and input/output, the library allows a cascade of transducers to be built using Unix pipes connecting Python scripts. Abstract. This paper presents a Python library called KNG which provides facili...
Journal: CoRR
Volume: cs.CL/0010030
Publication date: 2000